Contemporary Historians and the Reuse of Social Science Generated Data Sets: An International Dialogue on the Challenges Presented by "Social Data"

Organizers
Lutz Raphael, Trier University; Sabine Reh, Research Library for the History of Education, BBF-DIPF Berlin; Pascal Siegers, GESIS Leibniz Institute for the Social Sciences; Kerstin Brückweh, Berliner Hochschule für Technik, BHT; Christina von Hodenberg, GHIL
Location
London
Country
United Kingdom
From - To
28.10.2021 - 30.10.2021
By
Clemens Villinger, GESIS - Leibniz Institute for the Social Sciences, Mannheim

The third workshop of the DFG-funded project “Social Science Data as Sources for Contemporary History” aimed to establish an international dialogue between historians, sociologists, and representatives of the infrastructure that collects and provides access to social science generated data sets. In her introductory remarks, CHRISTINA VON HODENBERG (London) emphasized the value of international exchange for reflecting on the different approaches employed by contemporary historians who analyse and incorporate social science data into their research. In addition to addressing such methodological questions, the workshop provided an opportunity to gain a deeper understanding of how research data infrastructures process, archive, and make accessible social science data in countries across Europe.

The first panel focused on the reuse of qualitative and life history interviews. In her presentation, the sociologist JANE GRAY (Maynooth) introduced her research on “family rhythms”, reusing and combining archived qualitative social science data from the Life Histories and Social Change Collection, and drawing on semi-structured interviews conducted during the national longitudinal study of children Growing Up in Ireland (GUI). Both data sets have been deposited in the Irish Qualitative Data Archive and are now maintained and disseminated by the Digital Repository of Ireland. According to Gray, working across datasets using descriptive approaches and mixing them with other historical sources such as quantitative data allows an analysis of changing relationships between children and their grandparents across extended periods of time. Gray discussed the implications of a “descriptive turn” in the social sciences which draws on diverse sources (such as qualitative records and social media data) to present complex social phenomena and social change, thus overcoming the limitations of traditional surveys.

CLEMENS VILLINGER (Berlin) explained how he reused interviews conducted by social scientists in East Germany during the 1990s to write a history of consumption from an everyday perspective. He identified three major obstacles to his research: first, the interviews were hard to locate because they remained in the personal archives of the interviewers; second, accessibility depended on personal sympathies; and third, different ethical views exist about whether they can be reused at all. He called for a code of ethics to be drawn up to make it easier for historians to reuse social science data responsibly, and to reduce the associated costs. Taking his own research on the attribution of consumer responsibility after 1989–90 as an example, Villinger argued that the benefits of reusing interviews outweigh the challenges, although the support needed to reduce the burden of analysing existing social science research data is still lacking.

The third paper was presented by MARY STEWART and CHARLIE MORGAN (both London). Unlike in Germany, the reuse of interviews has played a key role in the British oral history movement since its beginnings in the 1970s. This is why the British Library Sound Archive aims to make as much qualitative data as possible available not only to scientists and academics, but also to the media, artists, and families. Before the public reuse of older collections is permitted, the General Data Protection Regulation (GDPR) requires the archive to identify personal data about living people and to evaluate whether its public release is likely to cause “substantial damage and distress”. Given the nature of the interviews, identifying sensitive passages requires an elaborate Lexicon search engine, which can itself be reused to make the collection searchable and therefore more accessible. Unlike in Germany, the British Library Sound Archive data sets are not anonymized, which means that information is not lost when they are used for research.

In her comment, KERSTIN BRÜCKWEH (Berlin) suggested establishing a Help Desk for historians dealing with ethical questions. She also raised the question of whose history we are writing if interviews with “ordinary people” are less easily accessible than interviews with “movers and shakers” such as politicians.

The second panel focused on survey data as a source for social history and started with a presentation by MOR GELLER (Jerusalem) about the KINO-DDR social science research project carried out by the East German Central Institute for Youth Research. The project was designed to elicit viewers’ opinions of socialist films. From a history of knowledge perspective, Geller demonstrated the complex relations between the social scientists, the survey, and the participants, which she characterized as “a double-ended line of communication”. She argued that opinion polls can be used as a historical source to open up ways of studying the relationship between the socialist state and its population.

MARCUS BÖICK (Bochum) spoke about his analysis of interviews with managers working for the Treuhandanstalt (Trust Agency, set up in 1990 to privatize East German enterprises) which were conducted by an ethnologist in 1992. After tracking them down in the personal possession of a former employee, Böick managed to retrieve the interview transcripts from a number of floppy discs. He used them to write a social microhistory of the Treuhandanstalt from the perspective and experience of the mostly West German managers. By identifying narrative patterns, Böick managed to create types such as the “industry manager” and topoi such as the self-definition as pioneers working on the economic frontier of the “wild east”. Böick highlighted open questions concerning the use of data sets rediscovered by historians when there are no guidelines for their appropriate use.

MORITZ J. FEICHTINGER (Bern) then introduced his work on quantification practices used to monitor, model, and manipulate societies. He drew upon the Hamlet Evaluation System (HES) used during the Vietnam War as an example. To understand and analyse computing techniques dating from the 1960s and 1970s, Feichtinger engages with a process he calls data “re-enactment”, which consists of five steps: the conversion of data into a readable format; the creation of a data lifecycle model; the annotation of converted data sets; the simulation (or mimicking) of historic update, maintenance, aggregation, and query routines; and finally, publication as a web-based simulation. According to Feichtinger, this approach allows a deeper understanding of how the use of data shaped (military) representations of the world that not only influenced decision-making and policy-making processes, but also had a tangible impact on the Vietnamese people.

The comment was given by Christina von Hodenberg. She asked what theoretical, ethical, and practical aspects need to be considered when reusing social science data produced in dictatorships, wars, or colonial contexts. During the discussion both Marcus Böick and PASCAL SIEGERS (Cologne) emphasized that errors, biases, and self-censorship are typical of data production in all political contexts. The second part of the discussion revolved around the fundamental question of whether it makes sense to reuse social science data if they do not allow established historical narratives to be challenged.

The second day started with a presentation by IRENA SALENIECE (Daugavpils) on oral history interviews with Latvian teachers that are archived in the Centre of Oral History established in 2003. Saleniece is both conducting new interviews and reusing existing qualitative data sets to write an experiential history of the Sovietization of the Latvian school system between the 1940s and 1960s. For her, oral history interviews with different generations of (often bi- and trilingual) Latvians serve to counteract the record from the state archives, which during the Soviet period falsified facts and silenced inconvenient voices. She focusses on emotional, episodic, and bottom-up perspectives to break through the standardised ways of “bolshevik speak”.

The director of the Mass Observation Archive, FIONA COURAGE (Sussex), introduced the history and holdings of the archive and presented her own research on the value of higher education. The initial Mass Observation project ran from 1937 to the mid-1950s and was revived in 1981. To this day, the charity-based archive records everyday life in Britain using volunteer panel writers who fill in questionnaires three times a year and also keep diaries. Like the interviews in the British Library Sound Archive, the data are not anonymized. As Courage put it, the broad consent of study participants allows personal data to be used to reconstruct long-term life stories.

In his comment, Pascal Siegers stressed the value of historical research on socialization in schools and other institutions, arguing that historians could enrich the debate in the social sciences. He questioned the reliability of oral history sources, pointing to their subjectivity. In response, LUTZ RAPHAEL (Trier) remarked that oral history interviews could help to reconstruct processes of subjectivization.

The final panel started with a presentation by ALEXANDER NÜTZENADEL (Berlin) about the impact of the “behavioural turn” on economic history. He used examples from the German Research Foundation funded programme “Experience and Expectation” to explain how reused social science generated data sets from large-scale surveys can be combined with techniques like “distant reading” of traditional sources such as newspapers to investigate how interactions between individual preferences, beliefs, and economic expectations lead to economic decisions. This historicization of expectations not only raises methodological questions but also leads to practical problems related to the long-term storage and accessibility of the research data produced. To manage and store the data, the programme has partnered with the Berlin State Library to design an infrastructure based on MyCoRe, which is free, open-source software for the development of data repositories.

The central question of the joint presentation by BENOÎT MAJERUS and LARS WIENEKE (both Esch-sur-Alzette) was how clandestine global and local networks of tax evasion can be identified by methods of data extraction from the public register of companies in Luxembourg. The main goal of their project is to identify and analyse networks of individual actors who registered companies. Although the registry is available in standardized PDF documents, named-entity extraction and a data-based understanding of these networks pose complex methodological questions. The data will permit an understanding of how networks for tax evasion developed in Luxembourg from the beginning of the twentieth century.

Lastly, MICHAEL WHITTALL (Erlangen) outlined a sociological project that revisits interviews with East German works councils conducted in the early 1990s. These historical interviews will be compared with recent sources on works councils in selected companies that are still in existence. Whittall and his colleagues aim to reconstruct changing perceptions of works councils in relation to factors such as qualifications or length of service. Like all projects represented in the workshop that reuse qualitative data produced by research on the transformation of the 1990s, this project faces data challenges such as accessibility, ethical and ownership questions, difficulties of researching historical production contexts, and issues of long-term storage.

According to Lutz Raphael, who commented on the last panel, all presentations illustrated that the old division of labour between sociology and history is becoming obsolete, not only because of new sources but also because of changing research methods. The search for weighted factors of causality is increasingly giving way to the search for patterns, meaning, process and agency. Even though the presentations touched on different subjects and sources, Raphael proposed the category of historical experience as a unifying point that could connect different branches of research. At the same time, he critically pointed to the emergence of a methodological gap produced by computing processes that are no longer fully understood by (most) historians. In response, ANDREAS FICKERS (Esch-sur-Alzette) described Digital Hermeneutics as a common space where data, tools, and infrastructure are shared. In his view, historians are now experiencing nothing less than a turning point in the history of science that is fundamentally changing epistemic traditions.

In his concluding comment, Fickers suggested four different modes of reusing social science generated data: re-(dis)covery, re-interpretation, re-contextualization, and re-enactment. The first mode involves historians applying techniques that used to be the typical domain of archives or libraries, such as retro-digitization, the annotation of metadata, and the restoration of data. Reinterpreting data means using new digital tools that not only empower historians but also limit historical knowledge production. Reflecting on the opportunities and limits of digital methods, Fickers pointed to “tool criticism” as a new historical instrument that can help to narrow the methodological gap. He argued that recontextualizing data also poses ethical questions, since it can disfigure meaning, while indexation processes can have excluding effects. To deal with issues arising from reframing sources in a digital environment, he suggested engaging in practices of “ethical editing” and interface criticism to understand how data sets are (re)presented on digital platforms. His last point, on re-enactment, referred to the materiality of data sets and the knowledge that is embedded not only in the physical data sets but also in the machines processing them.

The final discussion showed that there was no common understanding of how the terms “use” and “reuse” should be differentiated. But there was agreement that social science data sets are valuable sources that must be secured, archived, and made accessible. Von Hodenberg pointed out that there is a lack of international data infrastructure, although scientific knowledge production is increasingly dominated by international co-operation. Siegers explained this in terms of the specialization of nation-based scientific communities that demand infrastructure which fits their needs. Fickers, on the other hand, pointed to international standards such as the Europeana metadata scheme that not only enable interoperability, but also make archived data sets findable. In the end, the workshop showed that the reuse of data sets by contemporary historians is a dynamic field characterized by decentralized infrastructure and a broad variety of sources, tools, and approaches. It became clear that the collection, organization, and interpretation of social science generated data sets will continue to be a task for years to come.

Workshop overview:

Welcome and Introduction

Panel 1: Re-use of qualitative / life history interviews
Chair: Pascal Siegers (Cologne)

Jane Gray (Maynooth): Re-visiting social science data to understand social change

Clemens Villinger (Cologne): Researching Consumer Responsibility in East Germany and the responsible re-use of qualitative social data

Mary Stewart (London), Charlie Morgan (London): Archiving oral history to enable re-use

Kerstin Brückweh (Berlin): Comment

Panel 2: Survey data as sources for social history
Chair: Sabine Reh (Berlin)

Mor Geller (Jerusalem): Cinema Research and Socialist Imaginations: East German film audience research as historical source

Marcus Böick (Bochum): Historical Research as Investigative Journalism? The Quest for a Lost Interview Project

Moritz J. Feichtinger (Bern): “The greatest social-science laboratory we have ever had!”: Computational psychographics during the War for South Vietnam and its remains

Christina von Hodenberg (London): Comment

Panel 3: Writing educational history using social data
Chair: Kerstin Brückweh (Berlin)

Irena Saleniece (Daugavpils): Latvian Teachers of the 1940s–1960s: The Use and Reuse of Oral Evidence

Sabine Reh (Berlin), Eckhard Klieme (Frankfurt): Teaching cultures in the 1990s after German reunification: The political and educational importance of differences between the West and the East

Fiona Courage (Sussex): Mass-Observing education: interpreting other people’s lives

Pascal Siegers (Cologne): Comment

Panel 4: Social Data in Economy and Labour history
Chair: Clemens Villinger (Berlin)

Alexander Nützenadel (Berlin): Economic History and the Behavioral Turn: What can we learn from surveys and other social data?

Benoît Majerus (Esch-sur-Alzette), Lars Wieneke (Esch-sur-Alzette): Making shell companies visible. Digital history as a tool to unveil global networks and local infrastructures

Michael Whittall (Erlangen): Historical Co-determination Developments in the Eastern part of Germany: The Question of Data

Lutz Raphael (Trier): Comment

Andreas Fickers (Esch-sur-Alzette): Concluding Comment

Discussion Chair: Christina von Hodenberg (London)